Privacy Preserving Database Generation for Database Application Testing

Authors

  • Xintao Wu
  • Yongge Wang
  • Songtao Guo
  • Yuliang Zheng
Abstract

Testing database applications is of great importance. Although various studies have investigated testing techniques for database design, relatively few efforts have explicitly addressed the testing of database applications, which requires a large amount of representative data. Because testing over live production databases is often infeasible, owing to the high risk of disclosing confidential information or incorrectly updating real data, in this paper we investigate the problem of generating synthetic databases based on a priori knowledge about production databases. Our approach is to fit the general location model using various characteristics (e.g., constraints, statistics, rules) extracted from a production database and then generate synthetic data using the learned model. The generated data is valid and similar to the real data in terms of statistical distribution; hence it can be used for functional and performance testing. Because the extracted characteristics may contain information that attackers could use to derive confidential information about individuals, we present a disclosure analysis method that applies cell suppression for identity disclosure and perturbation for value disclosure.
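The pipeline described in the abstract (fit a general location model from extracted statistics, guard against disclosure, then sample synthetic rows) can be illustrated with a minimal sketch. It assumes one categorical attribute and one numeric attribute; in the general location model the categorical attribute defines cells with multinomial probabilities, and the numeric attribute is normal within each cell with a cell-specific mean and a shared variance. The function names, the minimum-count threshold `k`, and the Gaussian noise added to cell means are hypothetical stand-ins chosen for illustration, not the procedure used in the paper.

```python
import random
from collections import Counter, defaultdict

def fit_glm(rows, cat_key, num_key):
    """Estimate cell counts, per-cell means, and one pooled variance."""
    counts = Counter(r[cat_key] for r in rows)
    sums = defaultdict(float)
    for r in rows:
        sums[r[cat_key]] += r[num_key]
    means = {c: sums[c] / counts[c] for c in counts}
    # The general location model shares one variance across all cells.
    sse = sum((r[num_key] - means[r[cat_key]]) ** 2 for r in rows)
    dof = max(len(rows) - len(counts), 1)
    return {"counts": dict(counts), "means": means, "var": sse / dof}

def suppress_small_cells(model, k):
    """Drop cells with fewer than k records (assumed threshold rule)."""
    kept = {c: n for c, n in model["counts"].items() if n >= k}
    return {"counts": kept,
            "means": {c: model["means"][c] for c in kept},
            "var": model["var"]}

def perturb_means(model, noise_sd, rng):
    """Add zero-mean Gaussian noise to each cell mean (assumed perturbation)."""
    means = {c: m + rng.gauss(0.0, noise_sd) for c, m in model["means"].items()}
    return {**model, "means": means}

def generate(model, n, rng):
    """Sample n synthetic (cell, value) pairs from the fitted model."""
    cells = list(model["counts"])
    weights = [model["counts"][c] for c in cells]
    sd = model["var"] ** 0.5
    out = []
    for _ in range(n):
        cell = rng.choices(cells, weights=weights, k=1)[0]
        out.append((cell, rng.gauss(model["means"][cell], sd)))
    return out
```

A caller would chain the steps, for example `generate(perturb_means(suppress_small_cells(fit_glm(rows, "dept", "salary"), k=2), 1.0, rng), 1000, rng)` with `rng = random.Random(seed)`, where `"dept"` and `"salary"` are made-up column names. Suppressing before sampling means that rare cells, which are the ones most likely to identify an individual, never appear in the synthetic data.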

Similar resources

Privacy Preserving Data Generation for Database Application Performance Testing

Synthetic data plays an important role in software testing. In this paper, we initiate the study of synthetic data generation models for application software performance testing. In particular, we discuss models for protecting privacy in synthetic data generation. Within this model, we investigate the feasibility and techniques for privacy preserving synthetic database gene...

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about social network graph data publishing have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node, assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

Separating indexes from data: a distributed scheme for secure database outsourcing

Database outsourcing is an idea for relieving organizations of the burden of database management. Since data is a critical asset of organizations, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...

Distributed Pseudo-Random Number Generation and Its Application to Cloud Database

Cloud databases are a rapidly growing trend in the cloud computing market. They enable clients to run their computations on outsourced databases or to access distributed database services in the cloud. At the same time, security and privacy concerns are a major challenge to the continued growth of cloud databases. To enhance the security and privacy of the cloud database technology, th...

Statistical Database Modeling for Privacy Preserving Database Generation

Testing database applications is of great importance. Although various studies have investigated testing techniques for database design, relatively few efforts have explicitly addressed the testing of database applications, which requires a large amount of representative data. As testing over live production databases is often infeasible in many situations...

Journal title:
  • Fundam. Inform.

Volume 78  Issue 

Pages  -

Publication date 2007